{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "scikit-learn is a machine learning library for python, with a very easy to use API and great documentation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "from __future__ import print_function\n",
    "%matplotlib inline\n",
    "import mdtraj as md\n",
    "import matplotlib.pyplot as plt\n",
    "from sklearn.decomposition import PCA"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "Lets load up our trajectory. This is the trajectory that we generated in\n",
    "the \"Running a simulation in OpenMM and analyzing the results with mdtraj\"\n",
    "example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "traj = md.load('ala2.h5')\n",
    "traj"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Create a two component PCA model, and project our data down into this\n",
    "reduced dimensional space. Using just the cartesian coordinates as\n",
    "input to PCA, it's important to start with some kind of alignment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "pca1 = PCA(n_components=2)\n",
    "traj.superpose(traj, 0)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "reduced_cartesian = pca1.fit_transform(traj.xyz.reshape(traj.n_frames, traj.n_atoms * 3))\n",
    "print(reduced_cartesian.shape)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we can plot the data on this projection."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "plt.figure()\n",
    "plt.scatter(reduced_cartesian[:, 0], reduced_cartesian[:,1], marker='x', c=traj.time)\n",
    "plt.xlabel('PC1')\n",
    "plt.ylabel('PC2')\n",
    "plt.title('Cartesian coordinate PCA: alanine dipeptide')\n",
    "cbar = plt.colorbar()\n",
    "cbar.set_label('Time [ps]')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Lets try cross-checking our result by using a different feature space that isn't sensitive to alignment, and instead to \"featurize\" our trajectory by computing the pairwise distance between every atom  in each frame, and using that as our high dimensional input space for PCA."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "pca2 = PCA(n_components=2)\n",
    "\n",
    "from itertools import combinations\n",
    "# this python function gives you all unique pairs of elements from a list\n",
    "\n",
    "atom_pairs = list(combinations(range(traj.n_atoms), 2))\n",
    "pairwise_distances = md.geometry.compute_distances(traj, atom_pairs)\n",
    "print(pairwise_distances.shape)\n",
    "reduced_distances = pca2.fit_transform(pairwise_distances)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "plt.figure()\n",
    "plt.scatter(reduced_distances[:, 0], reduced_distances[:,1], marker='x', c=traj.time)\n",
    "plt.xlabel('PC1')\n",
    "plt.ylabel('PC2')\n",
    "plt.title('Pairwise distance PCA: alanine dipeptide')\n",
    "cbar = plt.colorbar()\n",
    "cbar.set_label('Time [ps]')"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
